Key Concepts for Parallel Out-of-Core LU Factorization

Authors

  • Jack J. Dongarra
  • Sven Hammarling
  • David W. Walker

Abstract

This paper considers key ideas in the design of out-of-core dense LU factorization routines. A left-looking variant of the LU factorization algorithm is shown to require less I/O to disk than the right-looking variant, and is used to develop a parallel, out-of-core implementation. This implementation makes use of a small library of parallel I/O routines, together with ScaLAPACK and PBLAS routines. Results for runs on an Intel Paragon are presented and interpreted using a simple performance model.
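
The I/O claim in the abstract can be illustrated with a small, serial toy model. The sketch below is not the paper's implementation: it assumes no pivoting (so it is only safe for matrices such as diagonally dominant ones), stores the matrix as block columns in a hypothetical in-memory `PanelDisk` class that merely counts panel reads and writes, and uses the illustrative names `factor_panel`, `apply_panel`, `left_looking`, and `right_looking`. Under these assumptions both variants read the same number of panels, but the left-looking variant writes each panel only once.

```python
import numpy as np


class PanelDisk:
    """Hypothetical stand-in for disk: one array per block column, with I/O counters."""

    def __init__(self, a, b):
        self.panels = [a[:, j:j + b].copy() for j in range(0, a.shape[1], b)]
        self.reads = 0
        self.writes = 0

    def read(self, j):
        self.reads += 1
        return self.panels[j].copy()

    def write(self, j, panel):
        self.writes += 1
        self.panels[j] = panel.copy()

    def assemble(self):
        return np.hstack(self.panels)


def factor_panel(p):
    """Unpivoted LU of a tall panel, in place (assumes nonzero pivots)."""
    for i in range(p.shape[1]):
        p[i + 1:, i] /= p[i, i]
        p[i + 1:, i + 1:] -= np.outer(p[i + 1:, i], p[i, i + 1:])


def apply_panel(f, p, r0, b):
    """Update panel p using the already factored panel f whose diagonal block starts at row r0."""
    r1 = r0 + b
    l_kk = np.tril(f[r0:r1], -1) + np.eye(b)    # unit lower-triangular diagonal block
    p[r0:r1] = np.linalg.solve(l_kk, p[r0:r1])  # block row of U
    p[r1:] -= f[r1:] @ p[r0:r1]                 # update the rows below


def left_looking(disk, b):
    for j in range(len(disk.panels)):
        p = disk.read(j)
        for k in range(j):                      # pull in all panels to the left
            apply_panel(disk.read(k), p, k * b, b)
        factor_panel(p[j * b:])
        disk.write(j, p)                        # each panel is written exactly once


def right_looking(disk, b):
    nb = len(disk.panels)
    for k in range(nb):
        f = disk.read(k)
        factor_panel(f[k * b:])
        disk.write(k, f)
        for j in range(k + 1, nb):              # push the update to all panels on the right
            p = disk.read(j)
            apply_panel(f, p, k * b, b)
            disk.write(j, p)                    # trailing panels are rewritten at every step
```

With `nb` block columns, this model performs `nb + nb*(nb-1)/2` panel reads in both variants, while the writes are `nb` for the left-looking loop and `nb + nb*(nb-1)/2` for the right-looking loop. Real out-of-core codes add pivoting and parallel data distribution, which this sketch leaves out.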

Related resources

Parallel LU Factorization on GPU Cluster

This paper describes our progress in developing software for performing parallel LU factorization of a large dense matrix on a GPU cluster. Three approaches, with increasing software complexity, are considered: (i) a naive “thunking” approach that links the existing parallel ScaLAPACK software library with cuBLAS through a software emulation layer; (ii) a more intrusive magmaBLAS implementation...

The design and implementation of the parallel out-of-core ScaLAPACK LU, QR, and Cholesky factorization routines

This paper describes the design and implementation of three core factorization routines (LU, QR, and Cholesky) included in the out-of-core extension of ScaLAPACK. These routines allow the factorization and solution of a dense system that is too large to fit entirely in physical memory. The full matrix is stored on disk and the factorization routines transfer sub-matrix panels into memory. The ‘l...

THE USE OF SEMI INHERITED LU FACTORIZATION OF MATRICES IN INTERPOLATION OF DATA

Polynomial interpolation in the one-dimensional space R is an important method for approximating functions. The Lagrange and Newton methods are two well-known types of interpolation. In this work, we describe semi-inherited interpolation for approximating the values of a function. In this case, the interpolation matrix has a semi-inherited LU factorization.

The Design and Implementation of the Parallel Out-of-core ScaLAPACK

This paper describes the design and implementation of three core factorization routines (LU, QR and Cholesky) included in the out-of-core extension of ScaLAPACK. These routines allow the factorization and solution of a dense system that is too large to fit entirely in physical memory. An image of the full matrix is maintained on disk and the factorization routines transfer sub-matrices into mem...

High-Performance Out-of-Core Sparse LU Factorization

We present an out-of-core sparse nonsymmetric LU-factorization algorithm with partial pivoting. We have implemented the algorithm and our experiments show that it can easily factor matrices whose factors are larger than main memory at rates comparable to those of an in-core solver. The algorithm is novel in several respects, including the use of panels that are larger than memory and the use o...

Journal:
  • Parallel Computing

Volume 23

Publication date: 1997